U.S.C. § 119(e), and international application No. PCT/US02/22598, filed July 16, 
2002 in English under 35 U.S.C. § 120, the entirety of which is incorporated by 
reference herein. 

Government Grants 

At least part of the work contained in this application was performed under 
government grant HG00041 from the National Institutes of Health. The government 
may have certain rights in this invention. 

Field of the Invention 

The invention relates to non-affinity based stable isotope tags and methods of 
using these for quantitative protein expression profiling. 

Background of the Invention 

Proteins are essential for the control and execution of virtually every 
biological process. Protein function is not necessarily a direct manifestation of the 
expression level of a corresponding mRNA transcript in a cell, but is impacted by 
post-translational modifications, such as protein phosphorylation, and the association 
of proteins with other biomolecules. It is therefore essential that a complete 
description of a biological system include measurements that indicate the identity, 
quantity and the state of activity of the proteins which constitute the system. The 
large-scale analysis of proteins expressed in a cell or tissue has been termed proteome 
analysis (Pennington et al., 1997). 

At present no protein analytical technology approaches the throughput and 
level of automation of genomic technology. The most common implementation of 
proteome analysis is based on the separation of complex protein samples, most 
commonly by two-dimensional gel electrophoresis (2DE), and the subsequent 
sequential identification of the separated protein species (Ducret et al., 1998; Garrels 
et al., 1997; Link et al., 1997; Shevchenko et al., 1996; Gygi et al. 1999; Boucherie et 
al., 1996). This approach has been revolutionized by the development of powerful 
mass spectrometric techniques and the development of computer algorithms which 
correlate protein and peptide mass spectral data with sequence databases and thus 
rapidly and conclusively identify proteins (Eng et al., 1994; Mann and Wilm, 1994; 
Yates et al., 1995). 

This technology has reached a level of sensitivity which now permits the 
identification of essentially any protein which is detectable by conventional protein 
staining methods including silver staining (Figeys and Aebersold, 1998; Figeys et al., 
1996; Figeys et al., 1997; Shevchenko et al., 1996). However, the sequential manner 
in which samples are processed limits the sample throughput, the most sensitive 
methods have been difficult to automate and low abundance proteins, such as 
regulatory proteins, escape detection without prior enrichment, thus effectively 
limiting the dynamic range of the technique. 

The development of methods and instrumentation for automated, data- 
dependent electrospray ionization (ESI) tandem mass spectrometry (MS/MS) in 
conjunction with microcapillary liquid chromatography (LC) and database searching 
has significantly increased the sensitivity and speed of the identification of gel- 
separated proteins. Microcapillary LC-MS/MS has been used successfully for the 
large-scale identification of individual proteins directly from mixtures without gel 
electrophoretic separation (Link et al., 1999; Opitek et al., 1997). However, while 
these approaches dramatically accelerate protein identification, quantities of the 
analyzed proteins cannot be easily determined, and these methods have not been 
shown to substantially alleviate the dynamic range problem also encountered by the 
2DE/MS/MS approach. Therefore, low abundance proteins in complex samples are 
also difficult to analyze by the microcapillary LC/MS/MS method without their prior 
enrichment. 

There is thus a need to provide methods for the accurate comparison of protein 
expression levels between cells in two different states, particularly for comparison of 
low abundance proteins. ICAT reagent technology makes use of a class of 
chemical reagents called isotope coded affinity tags (ICAT). These reagents exist in 
isotopically heavy and light forms which are chemically identical with the exception 
of eight deuterium or hydrogen atoms, respectively. Proteins from two cell lysates 
can be labeled independently with one or the other ICAT reagent at cysteinyl residues. 
After mixing and proteolysing the lysates, the ICAT-labeled peptides are isolated by 
affinity to a biotin molecule incorporated into each ICAT reagent. ICAT-labeled 
peptides are analyzed by LC-MS/MS where they elute as heavy and light pairs of 
peptides. Quantification is performed by determining the relative expression ratio 
relating to the amount of each ICAT-labeled peptide pair in the sample. 
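
As a rough illustration of the heavy/light pairing described above, the following Python sketch computes the mass difference expected for a tag pair that differs by eight deuterium-for-hydrogen substitutions, the resulting peak spacing on the m/z axis, and a relative ratio from two peak intensities. The monoisotopic masses are standard values; the function names and the use of a simple intensity quotient are assumptions made for illustration and are not taken from the ICAT literature.

```python
# Monoisotopic masses (in unified atomic mass units) of 1H and 2H (deuterium).
MASS_H = 1.00782503
MASS_D = 2.01410178


def heavy_light_mass_shift(n_deuterium: int = 8) -> float:
    """Mass difference between heavy and light tag forms that differ
    by n_deuterium hydrogen-to-deuterium substitutions."""
    return n_deuterium * (MASS_D - MASS_H)


def mz_spacing(mass_shift: float, charge: int) -> float:
    """Spacing between the heavy and light peaks on the m/z axis
    for a peptide ion carrying the given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return mass_shift / charge


def relative_ratio(heavy_intensity: float, light_intensity: float) -> float:
    """Relative expression ratio (heavy over light) for one peptide pair.
    Hypothetical helper: a plain quotient of the two intensities."""
    if light_intensity <= 0:
        raise ValueError("light intensity must be positive")
    return heavy_intensity / light_intensity


if __name__ == "__main__":
    shift = heavy_light_mass_shift()   # about 8.05 u for eight deuteriums
    print(round(shift, 4), round(mz_spacing(shift, 2), 4))
    print(relative_ratio(300.0, 150.0))
```

A peptide carrying more than one labeled cysteine would show a correspondingly larger shift, since each tag contributes its own set of substitutions.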

Identification of each ICAT-labeled peptide is performed by a second stage of 
mass spectrometry (MS/MS) and sequence database searching. The end result is 
relative protein expression ratios on a large scale. The major drawbacks of this 
technique are: 1) quantification is only relative; 2) specialized chemistry is required; 
3) database searches are hindered by the presence of the large ICAT reagent 
molecule; and 4) relative amounts of posttranslationally modified (e.g., 
phosphorylated) proteins are transparent to analysis. 

Summary of the Invention 

It is an object of the invention to provide improved chemistry, reagents, and 
kits for accurate quantification of proteins. In one preferred aspect, proteins can be 
quantitated directly from cell lysates. The reagents can be used for the rapid and 
quantitative analysis of protein in mixtures of proteins, e.g., to profile the proteome of 
a cell at a particular cell state. 

In one aspect, the invention provides a reagent for mass spectrometric analysis 
of proteins comprising a tag molecule. Preferably, the tag molecule comprises a 
reactive site for stably associating with a protein, an isotope label, and an anchoring 
site for anchoring the tag molecule to a solid phase. Anchoring may be direct, e.g., as 
a consequence of a covalent or non-covalent bond between the anchoring site of the 
tag and the solid phase, or indirect, through a linker which can be cleaved from the tag 
molecule. 
In one preferred aspect, the anchoring site of the tag molecule forms a pH 
sensitive bond with the solid phase. Preferably, the anchoring site forms covalent 
bonds to a cis hydroxyl pair on the solid phase under selected pH conditions and can 
be disassociated from the solid phase by changing those conditions. 

In another aspect, the tag molecule comprises the general formula R-B(OH)2, 
wherein the R group is a suitable chemical moiety for attaching the isotope. Suitable 
R groups include, but are not limited to: an alkyl group, aryl group, heteroaryl group, 
arylalkyl group, heteroarylalkyl group, and a cyclic molecule. In a further aspect, the 
tag molecule is phenyl-B(OH)2. 

Preferred isotopes are stable isotopes selected from the group consisting of a 
stable isotope of hydrogen, nitrogen, oxygen, carbon, phosphorus and sulfur. 

Reactive site groups include, but are not limited to chemical moieties that 
react with sulfhydryl groups, amino groups, carboxylate groups, ester groups, 
phosphate groups, aldehyde groups, ketone groups and with homoserine lactone after 
fragmentation with CNBr. Sites on proteins may be naturally reactive with reactive 
site groups or can be made reactive upon exposure to an agent (e.g., an alkylating 
agent, a reducing agent, etc). 

In one aspect, the reactive site group of the tag molecule forms a stable 
association with a modified residue of a protein. The modified residue may be 
glycosylated, methylated, acylated, phosphorylated, ubiquitinated, farnesylated, or 
ribosylated. 

The pH sensitive anchoring group of a tag molecule forms a bond with a solid 
phase under selected pH conditions. Examples of pH sensitive bonds include, but are 
not limited to: acyloxyalkyl ether bonds, acetal bonds, thioacetal bonds, aminal bonds, 
imine bonds, carbonate bonds, and ketal bonds. 

The invention also provides a composition comprising a pair of tag molecules 
as described above, where each member of the pair is identical except for the mass of 
the isotope attached thereto. For example, one member of the pair comprises a heavy 
isotope and the other member of the pair comprises the corresponding light form of 
the isotope. Alternatively, one member of the pair may be labeled while the other 
member is not. 

The invention further provides a kit comprising reagents and/or compositions 
as described above, and one or more of a reagent selected from the group consisting 
of: an activating agent for providing active groups on a protein which bind to the 
reactive site of the tag molecule; a solid phase; one or more agents for lysing a cell; a 
pH altering agent; one or more proteases; one or more cell samples or fractions 
thereof. The tag molecule may further be stably associated with a peptide. 

The invention also provides kits comprising a plurality of tagged peptide 
molecules, each tagged peptide molecule comprising a peptide and a tag molecule 
stably associated with the protein, the tag molecule further comprising an isotope 
label, and a pH sensitive anchoring site for anchoring the tag molecule to a solid 
phase. In one aspect, the kit comprises pairs of tagged peptides and each member of a 
pair of tagged peptides comprises an identical peptide and is differentially labeled 
from the other member of the pair. In another aspect, the kit comprises at least one 
set of tagged peptides, the set comprising different peptides corresponding to a single 
protein. In still another aspect, at least one set of tagged peptides comprises peptides 
corresponding to modified and unmodified forms of a single protein. In a further 
aspect, the kit comprises at least one set of tagged peptides from a first cell at a first 
cell state and at least one set of tagged peptides from a second cell at a second cell 
state. For example, the first cell may be a normally proliferating cell while the second 
cell is an abnormally proliferating cell (e.g., a cancer cell). First and second cells may 
also represent different stages of cancer. 

The invention additionally provides a method for identifying one or more 
proteins or protein functions in one or more samples containing mixtures of proteins. 
In one aspect, the method comprises: reacting a first sample with any of the reagents 
described above and a solid phase under conditions suitable to form a solid phase- 
isotope labeled tag molecule-protein complex. The complex is exposed to one or 
more proteases, generating solid phase-isotope labeled tag molecule-peptide 
complexes and untagged peptides. The solid phase-isotope labeled tag molecule- 
peptide complexes are purified from untagged peptides and exposed to a pH which 
disrupts associations between the anchoring site of the tag molecule and the solid 
phase, thereby releasing tagged peptides from the solid phase. Preferably, the sample 
is subjected to a separation step such as liquid chromatography. The mass of the 
tagged peptide is determined and correlated with the identity and/or activity of a 
protein (e.g., the presence of a particular modified form of a protein which is known 
to be active). Preferably, a mass-to-charge ratio is determined, e.g., by multistage 
mass spectrometric (MS^n) analysis. In addition to determining the identity of a 
protein, a quantitative measure of the amount of protein in the sample may be 
obtained. The method may also be used to determine the site of a modification of a 
protein in one or more samples, by reacting sample proteins with a tag molecule 
comprising a reactive site which reacts with a modified residue on the protein. In 
another aspect, the amount of a modified protein in a sample is also determined. 

In a further aspect, the method further comprises reacting a second sample 
with a second reagent comprising an identical molecular tag as the reagent used in the 
first sample but which is differentially labeled. Samples are processed in parallel and 
combined prior to protease digestion. This generates a combined sample comprising 
at least one pair of tagged peptides, each member of the pair comprising identical 
peptides but differing in mass. The ratio of members of at least one tagged peptide 
pair in the combined sample is determined. Preferably, mass spectra are generated. 
Such spectra will comprise at least one signal doublet for each peptide in the sample, 
the signal doublet comprising a first signal and a second signal shifted a number of 
known units from the first signal. The known units will represent the difference in 
molecular weight between the two members of a tagged peptide pair. Preferably, a 
signal ratio for a given peptide is determined by relating the difference in signal 
intensity between the first signal and the second signal. 
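
The doublet search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the peak list, the 4.0-unit mass shift, the tolerance, and the function name `find_doublets` are all hypothetical, and the signal ratio is taken here as a simple quotient of the two intensities, which is only one way of relating them.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def find_doublets(
    peaks: List[Peak],
    mass_shift: float,
    charge: int = 1,
    tol: float = 0.01,
) -> List[Tuple[float, float, float]]:
    """Pair each peak with a partner shifted by mass_shift / charge on the
    m/z axis. Returns (first_mz, second_mz, second/first intensity ratio)."""
    spacing = mass_shift / charge
    ordered = sorted(peaks)  # ascending m/z
    doublets = []
    for i, (mz_first, int_first) in enumerate(ordered):
        for mz_second, int_second in ordered[i + 1:]:
            delta = mz_second - mz_first
            if delta > spacing + tol:
                break  # later peaks are even further away
            if abs(delta - spacing) <= tol and int_first > 0:
                doublets.append((mz_first, mz_second, int_second / int_first))
    return doublets


if __name__ == "__main__":
    # Hypothetical centroided peak list for singly charged tagged peptides.
    spectrum = [
        (500.00, 1000.0),
        (504.00, 2000.0),
        (620.30, 400.0),   # no partner: not part of a doublet
        (700.10, 50.0),
        (704.10, 25.0),
    ]
    for first, second, ratio in find_doublets(spectrum, mass_shift=4.0):
        print(first, second, ratio)
```

In this sketch a peak without a partner at the expected spacing is simply ignored, which mirrors the idea that only tagged peptides present in both samples give rise to doublets.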

The relative amounts of members of a tagged peptide pair in the two samples 
are determined and correlated with the abundance of the protein corresponding to the 
peptide in the sample. Abundance may be correlated with the state of cells from 
which the samples were obtained. The correlation may be used to diagnose a 
pathological condition in a patient from whom one of the cell samples was obtained 
(e.g., where one of the cell states represents a disease condition). 

Single samples or multiple samples may be analyzed by relating mass spectra 
data from a tagged peptide to an amino acid sequence. The steps of the method can 
be repeated, either sequentially or simultaneously, until substantially all of the 
proteins in a sample are detected and/or identified. In this way a proteome profile for 
one or more cells can be obtained. 

Brief Description of the Figures 

The objects and features of the invention can be better understood with 
reference to the following detailed description and accompanying drawings. 

Figure 1 is a schematic diagram illustrating the use of resin-based chemistries 
to tag peptides according to one aspect of the invention. 

Figure 2 shows exemplary cleavable linkers that can be used in the method 
shown in Figure 1. 

Figure 3 shows the use of arylboronic acids for protein quantification 
according to one aspect of the invention. 

Figure 4 shows the elution profile for a carbohydrate affinity column 
demonstrating pH sensitive attachment of boron-based tag molecules. 

Figures 5A and B show two strategies for capturing and labeling cysteine- 
containing peptides. Figure 5A shows the use of a boron-based molecular tag which 
binds to a resin support comprising cis hydroxy groups presented by a 5-membered 
cyclic ring compound via the two hydroxy groups on the tag. The tag binds to 
proteins via a cysteine reactive moiety. Figure 5B shows the use of the 5-membered 
cyclic ring as the tag molecule and the use of R-B(OH)2 as the molecule which 
presents cis hydroxy groups to capture the tag molecule. 

Detailed Description 

The invention provides non-affinity based isotope tagged peptides, chemistries 
for making these peptides, and methods for using these peptides. In one aspect, tags 
comprise a reactive site (RS) for reacting with a molecule on a protein to form a stable 
association with the peptide (e.g., a covalent bond) and an anchoring site (AS) group 
for reversibly or removably anchoring the tag to a solid phase such as a resin support. 
Anchoring may be direct or indirect (e.g., through a linker molecule). Preferably, the 
tag comprises a mass-altering label, such as a stable isotope, such that association of 
the tag with the peptide can be monitored by mass spectrometry. The reagents can be 
used for rapid and quantitative analysis of proteins or protein function in mixtures of 
proteins. 

Definitions 

The following definitions are provided for specific terms which are used in the 
following written description. 

As used in the specification and claims, the singular form "a", "an" and "the" 
include plural references unless the context clearly dictates otherwise. For example, 
the term "a cell" includes a plurality of cells, including mixtures thereof. The term "a 
protein" includes a plurality of proteins. 

"Protein", as used herein, means any protein, including, but not limited to 
peptides, enzymes, glycoproteins, hormones, receptors, antigens, antibodies, growth 
factors, etc., without limitation. Presently preferred proteins include those comprised 
of at least 25 amino acid residues, more preferably at least 35 amino acid residues and 
still more preferably at least 50 amino acid residues. 

As used herein, the term "peptide" refers to a compound of two or more 
subunit amino acids. The subunits are linked by peptide bonds. 

As used herein, the term "alkyl" refers to univalent groups derived from 
alkanes by removal of a hydrogen atom from any carbon atom: CnH2n+1-. The groups 
derived by removal of a hydrogen atom from a terminal carbon atom of unbranched 
alkanes form a subclass of normal alkyl (n-alkyl) groups: H[CH2]n-. The groups 
RCH2-, R2CH- (R not equal to H), and R3C- (R not equal to H) are primary, 
secondary and tertiary alkyl groups respectively. C(1-22)alkyl refers to any alkyl 
group having from 1 to 22 carbon atoms and includes C(1-6)alkyl, such as methyl, 
ethyl, propyl, iso-propyl, butyl, pentyl and hexyl and all possible isomers thereof. By 
"lower alkyl" is meant C(1-6)alkyl, preferably C(1-4)alkyl, more preferably, methyl 
and ethyl. 
As used herein, the terms "aryl" and "heteroaryl" mean a 5- or 6-membered 
aromatic or heteroaromatic ring containing 0-3 heteroatoms selected from O, N, or S; 
a bicyclic 9- or 10-membered aromatic or heteroaromatic ring system containing 0-3 
heteroatoms selected from O, N, or S; or a tricyclic 13- or 14-membered aromatic or 
heteroaromatic ring system containing 0-3 heteroatoms selected from O, N, or S; each 
of which rings is optionally substituted with 1-3 lower alkyl, substituted alkyl, 
substituted alkynyl, -NO2, halogen, hydroxy, alkoxy, OCH(COOH)2, cyano, -NZZ, 
acylamino, phenyl, benzyl, phenoxy, benzyloxy, heteroaryl, or heteroaryloxy; each of 
said phenyl, benzyl, phenoxy, benzyloxy, heteroaryl, and heteroaryloxy is optionally 
substituted with 1-3 substituents selected from lower alkyl, alkenyl, alkynyl, halogen, 
hydroxy, alkoxy, cyano, phenyl, benzyl, benzyloxy, carboxamido, heteroaryl, 
heteroaryloxy, -NO2 or -NZZ (wherein Z is independently H, lower alkyl or 
cycloalkyl, and -ZZ may be fused to form a cyclic ring with nitrogen). 

"Arylalkyl" means an alkyl residue attached to an aryl ring. Examples are 
benzyl, phenethyl and the like. 

"Heteroaryl alkyl" means an alkyl residue attached to a heteroaryl ring. 
Examples include, e.g., pyridinylmethyl, pyrimidinylethyl and the like. 

"Substituted" alkyl groups mean alkyls where up to three H atoms on each C 
atom therein are replaced with halogen, hydroxy, lower alkoxy, carboxy, carboalkoxy, 
carboxamido, cyano, carbonyl, -NO2, -NZZ, alkylthio, sulfoxide, sulfone, 
acylamino, amidino, phenyl, benzyl, heteroaryl, phenoxy, benzyloxy, heteroaryloxy, 
or substituted phenyl, benzyl, heteroaryl, phenoxy, benzyloxy, or heteroaryloxy. 

An "amide" refers to -C(O)-NH-Z, where Z is alkyl, aryl, alkylaryl or 
hydrogen. 

A "thioamide" refers to -C(S)-NH-Z, where Z is alkyl, aryl, alkylaryl or 
hydrogen. 

An "ester" refers to -C(O)-OZ', where Z' is alkyl, aryl, or alkylaryl. 

An "amine" refers to -N(Z')Z'', where Z' and Z'' are independently 
hydrogen, alkyl, aryl, or alkylaryl, provided that Z' and Z'' are not both hydrogen. 
An "ether" refers to Z-O-Z, where Z is either alkyl, aryl, or alkylaryl. 

A "thioether" refers to Z-S-Z, where Z is either alkyl, aryl, or alkylaryl. 

A "cyclic molecule" is a molecule which has at least one chemical moiety 
which forms a ring. The ring may contain three atoms or more. The molecule may 
contain more than one cyclic moiety; the cyclic moieties may be the same or different. 

Tag Molecules 

Generally, tag molecules according to the invention comprise the formula: 

AS-R*-RS, 

where RS represents a reactive site group for reacting with a protein or peptide, AS 
represents an anchoring site group for stably associating the tag with a solid phase and 
R represents the backbone of the tag molecule to which the isotope label (*) is 
attached. As used herein, "stable" refers to an association which remains intact after 
extensive and multiple washings with a variety of solutions to remove non- 
specifically bound components. 

The tag may be stably associated with a solid phase (SP) either directly as 

SP-AS-R*-RS, 

where " — " between SP and AS represents a covalent bond. Preferably, this 
bond is pH sensitive. 

Alternatively, the tag may be stably associated with the solid phase as 

SP-L-AS-R*-RS, 

where L is a cleavable linker molecule with at least one cleavage site which can 
separate the linker from the tag molecule. 

Reactive Site Groups 

The reactive site of a tag molecule is a group that selectively reacts with 
certain protein functional groups or is a substrate or cofactor of an enzyme of interest. 
Preferably, the reactive group of the tag molecule reacts with a plurality of different 
types of cellular proteins. Reaction of the RS of the tag molecule with functional 
groups on the protein should occur under conditions that do not lead to substantial 
degradation of the compounds in the sample to be analyzed. Examples of RS groups 
include, but are not limited to those which react with sulfhydryl groups to tag proteins 
containing cysteine, and those that react with amino groups, carboxylate groups, ester 
groups, phosphate groups, and aldehyde and/or ketone groups or, 
after fragmentation with CNBr, with homoserine lactone. 

Cysteine reactive groups include, but are not limited to, epoxides, alpha- 
haloacyl groups, nitriles, sulfonated alkyl or aryl thiols and maleimides. Amino 
reactive groups tag amino groups in proteins and include sulfonyl halides, 
isocyanates, isothiocyanates, active esters, including tetrafluorophenyl esters, and N- 
hydroxysuccinimidyl esters, acid halides, and acid anhydrides. In addition, amino 
reactive groups include aldehydes or ketones in the presence or absence of NaBH4 or 
NaCNBH3. Figure 2 shows exemplary cysteine reactive groups on an arylboronic 
acid tag. 

Carboxylic acid reactive groups include amines or alcohols which become 
reactive in the presence of a coupling agent such as dicyclohexylcarbodiimide, or 
2,3,5,6-tetrafluorophenyl trifluoroacetate and in the presence or absence of a coupling 
catalyst such as 4-dimethylaminopyridine; and transition metal-diamine complexes 
including Cu(II)phenanthroline. 

Ester reactive groups include amines which, for example, react with 
homoserine lactone. 

Phosphate reactive groups include chelated metal where the metal is, for 
example, Fe(III) or Ga(III), chelated to, for example, nitrilotriacetic acid or 
iminodiacetic acid. 

Aldehyde or ketone reactive groups include amine plus NaBH4 or NaCNBH3, 
or these reagents after first treating a carbohydrate with periodate to generate an 
aldehyde or ketone. 

RS groups can also be substrates for a selected enzyme of interest. The 
enzyme of interest may, for example, be one that is associated with a disease state or 
birth defect or one that is routinely assayed for medical purposes. Enzyme substrates 
of interest for use with the methods of this invention include substrates for acid phosphatase, 
alkaline phosphatase, alanine aminotransferase, amylase, angiotensin converting 
enzyme, aspartate aminotransferase, creatine kinase, gamma-glutamyltransferase, 
lipase, lactate dehydrogenase, and glucose-6-phosphate dehydrogenase, which are 
currently routinely assayed. 

Anchoring Sites 

The tags according to the invention further comprise an anchoring site for 
forming stable associations with a solid phase. Tags are either reversibly anchored 
(e.g., can associate and dissociate from the solid phase depending on solution 
conditions, such as pH) or removably anchored (e.g., can be disassociated from the 
support but unable to reattach under any condition). Stable associations can include 
covalent or non-covalent bonds and, as discussed above, may be direct (i.e., the 
tag may bind covalently or non-covalently to the solid phase) or indirect (i.e., the tag 
may bind covalently or non-covalently to a linker molecule which itself forms direct 
stable associations with the solid phase). In this latter scenario, the anchoring site of 
the tag molecule is the site on the molecule which stably associates with the linker. In 
one preferred aspect, tags are anchored to solid supports by pH sensitive covalent 
bonds. 

Tags according to the invention bind minimally and preferably, not at all, to 
components in the assay system, except the solid phase, and do not significantly bind 
to surfaces of reaction vessels. Any non-specific interaction of the affinity tag with 
other components or surfaces should be disrupted by multiple washes that leave 
association between the tag and solid phase intact. The tag preferably does not 
undergo peptide-like fragmentation during (MS)^n analysis. The tag is preferably 
soluble in the sample liquid to be analyzed even though attached to a solid phase 
comprising an insoluble resin such as agarose. 

The tag molecule preferably also contains groups or moieties that facilitate 
ionization of tagged peptides. For example, the tag molecule may contain acidic or 
basic groups, e.g., COOH, SO3H, primary, secondary or tertiary amino groups, 
nitrogen-heterocycles, ethers, or combinations of these groups. The tag molecule may 
also contain groups having a permanent charge, e.g., phosphonium groups, quaternary 
ammonium groups, sulfonium groups, chelated metal ions, tetraalkyl or tetraaryl borate 
or stable carbanions. 

Cleavable Linkers 

In one aspect, a tag is associated indirectly with a solid phase through a linker 
molecule. As used herein, a "linker" refers to a bifunctional chemical moiety which 
comprises an end for stably associating with a solid phase and an end for stably 
associating with the tag. In one preferred aspect, the linker is cleavable. As used 
herein, the term "cleavage" refers to a process of releasing a material or compound 
from a solid support, e.g., to permit analysis of the compound by solution-phase 
methods. See, e.g., Wells et al. (1998), J. Org. Chem. 63:6430-6431. 

The linker group should be soluble in the sample liquid to be analyzed and 
should be stable with respect to chemical reaction, e.g., substantially chemically inert, 
with respect to components of the sample. Preferably, the linker does not interact 
with the tag molecule except at the tag molecule's anchoring site and does not interact 
with the support except at the end of the linker which forms stable associations with 
the support. Any non-specific interactions of the linker should be broken after 
multiple washes which leave the solid phase:linker:tag molecule (+ peptide) complex 
intact. Linkers preferably do not undergo peptide-like fragmentation during (MS)^n 
analysis. 

Exemplary linker molecules are shown in Figure 3. As can be seen from the 
Figure, the exact chemical structure of the linker can vary to allow cleavage to be 
controlled in a manner suiting a particular assay format and to allow coupling to a 
particular tag molecule. Thus, the linker can be cleavable by chemical, thermal or 
photochemical reaction. Photocleavable groups in the linker may include, but are not 
limited to, 1-(2-nitrophenyl)-ethyl groups. Thermally labile linkers may include, but 
are not limited to, a double-stranded duplex formed from two complementary strands 
of nucleic acid, a strand of a nucleic acid with a complementary strand of a peptide 
nucleic acid, or two complementary peptide nucleic acid strands which will dissociate 
upon heating. 
Cleavable linkers also include those having disulfide bonds, acid or base labile 
groups, including among others, diarylmethyl or trimethylarylmethyl groups, silyl 
ethers, carbamates, oxyesters, ethers, polyethers, diamines, ether diamines, polyether 
diamines, amides, polyamides, polythioethers, disulfides, silyl ethers, alkyl or alkenyl 
chains (straight chain or branched and portions of which may be cyclic) aryl, diaryl or 
alkyl-aryl groups, amides, polyamides, and esters. Enzymatically cleavable linkers 
include, but are not limited to, protease-sensitive amides or esters, beta-lactamase- 
sensitive beta-lactam analogs and linkers that are nuclease-cleavable, or glycosidase- 
cleavable. 

While normally amino acids and oligopeptides are not preferred, when used 
they will normally employ amino acids of from 2-3 carbon atoms, i.e. glycine and 
alanine. Aryl groups in linkers can contain one or more heteroatoms (e.g., N, O or S 
atoms). Linkages also include substituted benzyl ethers, esters, acetals or ketals, diols, 
and the like (See, U.S. Pat. No. 5,789,172 for a list of useful functionalities and 
manner of cleavage, herein incorporated by reference). The linkers, when other than a 
bond, will have from about 1 to 60 atoms, usually 1 to 30 atoms, where the atoms 
include C, N, O, S, P, etc., particularly C, N and O, and will generally have from 
about 1 to 12 carbon atoms and from about 0 to 8, usually 0 to 6 heteroatoms. The 
atoms are exclusive of hydrogen in referring to the number of atoms in a group, unless 
indicated otherwise. 

Additional types of linker molecules are described in, e.g., Backes and Ellman 
(1997) Curr. Opin. Chem. Biol. 1:86-93, Backes et al. (1996), J. Amer. Chem. Soc. 
118:3055-3056, Backes and Ellman (1994), J. Amer. Chem. Soc. 116:11171-11172, 
Hoffmann and Frank (1994), Tetrahedron Lett. 35:7763-7766, Kocis et al. (1993), 
Tetrahedron Lett. 34:7251-7252, and Plunkett and Ellman (1995), J. Org. Chem. 
60:6006-6007. 

In contrast to affinity-based tag molecules, such as ICAT™ reagents, tag 
molecules stably associated with linker molecules are generally not displaceable from 
the solid phase by addition of a displacing ligand or by changing solvent, and the 
cleavage site of the linker is generally distal from the support and proximal to the tag 
molecule. 
pH Sensitive Anchoring Sites 

In another aspect, the tag comprises a molecule with a pH sensitive anchoring 
site. Examples of such tags are shown in Figure 2. In one preferred aspect, such a tag 
minimally comprises R-B(OH)2, where the R group is a suitable chemical moiety for 
attaching a label such as a stable isotope. In one embodiment, R is a source of π 
electrons, i.e., is sp2-bonded to B. Therefore, preferably, R is an aromatic group such 
as a phenyl molecule. An exemplary tag molecule includes, but is not limited to, 
phenyl-B(OH)2. 

Additionally, the tag molecule comprises an RS group, preferably, covalently 
bound to the R group and distal from the -OH anchor site groups. In one preferred 
embodiment, the RS group comprises a cysteine-reactive moiety such as the group 
shown in Figure 2. However, generally, any of the RS groups described above may 
also be used as RS groups. 

Additional molecules may be present between the RS group and R group; 
however, preferably, the tag molecule is of a suitable size to facilitate mass 
spectrometric analysis. 

Though boron may be supplied in a variety of ways, it must be present as 
borate ions in order to bind to a solid phase support (e.g., such as a 
polysaccharide-containing support). According to D. J. Doonan and L. D. Lower 
("Boron Compounds (Oxide, Acids, Borates)", in Kirk-Othmer Encyclopedia of Chemical 
Technology, Vol. 4, p. 67-110, 3rd ed., 1978), boric acid, borate ion and polyions 
containing various amounts of boron, oxygen, and hydroxyl groups exist in dynamic 
equilibrium where the percentage of each of the species present is dictated mainly by 
the pH of the solution. Borate ion begins to dominate the other boron species present 
in the fluid at a pH of approximately 9.5 and exceeds 95% of total boron species 
present at a pH of about 11.5. According to B. R. Sanderson ("Coordination 
Compounds of Boric Acid" in Mellor's Comprehensive Inorganic Chemistry, p. 721- 
764, 1975), boron species (including borate ions and boric acid among others) react 
with di- and poly-hydroxyl compounds having a cis-hydroxyl pair to form complexes 
which are in rapid equilibrium with uncomplexed boron species and the cis-hydroxyl 
compounds. The relative amounts of the complexed and free materials are provided 
by the equilibrium constants for the specific systems. The equilibrium constant for 
borate ion is several orders of magnitude larger (typically by factors of 10⁴ to 10¹⁰) 
than the equilibrium constant for boric acid with the same cis-hydroxyl compound. 

For all practical purposes, borate ions form complexes (i.e., can serve to 
crosslink polysaccharides), while boric acid does not. Therefore, in order to have a 
useable crosslinked solid phase with the minimum boron content, most of the boron 
must be present as borate ions which requires a pH of at least about 8.5, preferably at 
least about 9.5. Reducing pH below these levels will reversibly break covalent bonds 
between the hydroxyl groups of the borate ions and the solid phase. 
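
As a rough illustration of the pH dependence described above, the fraction of boron present as borate ion can be estimated from a simple monoprotic acid-base equilibrium. The sketch below is only an approximation: it assumes a boric acid pKa of about 9.24 (a commonly quoted value near room temperature, not a figure taken from this application) and ignores the polyions mentioned above, so its output will not reproduce the cited percentages exactly. The function name is hypothetical.

```python
def borate_fraction(ph: float, pka: float = 9.24) -> float:
    """Fraction of total boron present as borate ion, B(OH)4-.

    Simplified monoprotic model: B(OH)3 + H2O <-> B(OH)4- + H+.
    The default pKa is an assumed literature value; polyborate
    species are ignored in this sketch.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # Compare the anchoring range (pH 8.5 and above) with the release range.
    for ph in (3.0, 7.0, 8.5, 9.5, 11.5):
        print(f"pH {ph:4.1f}: borate fraction = {borate_fraction(ph):.4f}")
```

Under these assumptions the borate fraction is negligible at pH 3 and approaches unity above pH 11, which is consistent with the anchoring and release behavior described in the text.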

Additional tag molecules with pH sensitive anchoring sites include molecules 
with pH sensitive bonds such as acyloxyalkyl ether, acetal, thioacetal, aminal, imine, 
carbamate, carbonate, and/or ketal bonds. Solid phases comprising silyl groups 
additionally can form pH sensitive bonds with hydroxyl, carboxylate, amino, 
mercapto, or enolizable carbonyl groups on tag molecules. 

In contrast to tag molecules in the art comprising affinity tags (e.g., such as 
ICAT™ reagents), tag molecules comprising pH sensitive anchoring sites generally 
retain the functional group that binds to the solid phase when disassociated from the 
solid phase (e.g., by a change in pH). The smaller size of non-affinity based tag 
molecules such as those containing boronic acid groups facilitates the analysis of 
tagged peptides by MSⁿ. 

Types of Labels 

The type of label selected is generally based on the following considerations: 

The mass of the label should preferably be unique to shift fragment masses 
produced by MS analysis to regions of the spectrum with low background. The ion 
mass signature component is the portion of the labeling moiety which preferably 
exhibits a unique ion mass signature in mass spectrometric analyses. The sum of the 
masses of the constituent atoms of the label is preferably uniquely different than the 
fragments of all the possible amino acids. As a result, the labeled amino acids and 
peptides are readily distinguished from unlabeled amino acids and peptides by their 
ion/mass pattern in the resulting mass spectrum. In a preferred embodiment, the ion 
mass signature component imparts a mass to a protein fragment produced during mass 
spectrometric fragmentation that does not match the residue mass for any of the 20 
natural amino acids. 
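
The requirement that a labeled residue not coincide with any natural residue mass can be checked numerically. The following is a minimal sketch, assuming standard monoisotopic residue masses, a cysteine-directed label, and a hypothetical matching tolerance of 0.02 Da; the function name and the example label masses are illustrative and are not taken from this application.

```python
# Monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASSES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def conflicting_residues(label_mass, tagged_residue="C", tol=0.02):
    """Return residues whose mass is within `tol` Da of the labeled residue.

    `label_mass` is the mass the label adds to `tagged_residue`. An empty
    result means the labeled residue cannot be mistaken for a natural one
    at the chosen tolerance (a hypothetical value in this sketch).
    """
    labeled = RESIDUE_MASSES[tagged_residue] + label_mass
    return sorted(
        aa for aa, mass in RESIDUE_MASSES.items() if abs(mass - labeled) <= tol
    )
```

For example, `conflicting_residues(200.0)` returns an empty list, whereas adding a methylene group (14.01565 Da) to glycine collides with alanine, which is the kind of ambiguity the selection criterion is meant to avoid.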

The label should be robust under the fragmentation conditions of MS and not 
undergo unfavorable fragmentation. Labeling chemistry should be efficient under a 
range of conditions, particularly denaturing conditions, and the labeled tag preferably 
remains soluble in the MS buffer system of choice. In one aspect, the label increases 
the ionization efficiency of the protein, or at least does not suppress it. Alternatively 
or additionally, the label contains a mixture of two or more isotopically distinct 
species to generate a unique mass spectrometric pattern at each labeled fragment 
position. 

In one preferred aspect, tags comprise mass-altering labels which are stable 
isotopes. In certain preferred embodiments, the method utilizes isotopes of hydrogen, 
nitrogen, oxygen, carbon, phosphorus or sulfur. Suitable isotopes include, but are 
not limited to, ²H, ¹³C, ¹⁸O, or ³⁴S. Pairs of tags can be provided, comprising 
identical tag and peptide portions but distinguishable labels. For example, a pair of 
tags can comprise isotopically heavy and isotopically light labels, e.g., such as a 
¹⁶O:¹⁸O pair or ¹H:²H. 
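
The mass difference that a heavy and light tag pair introduces follows directly from the isotope masses. The sketch below assumes the light/heavy pairs are 1H/2H, 12C/13C, 16O/18O and 32S/34S and uses standard monoisotopic masses; the function names and the example of eight substituted hydrogens are hypothetical.

```python
# Monoisotopic masses (Da) of assumed light/heavy stable isotope pairs.
ISOTOPE_MASSES = {
    "H": (1.007825, 2.014102),    # 1H, 2H
    "C": (12.000000, 13.003355),  # 12C, 13C
    "O": (15.994915, 17.999160),  # 16O, 18O
    "S": (31.972071, 33.967867),  # 32S, 34S
}


def pair_mass_shift(substitutions):
    """Mass difference (Da) between the heavy and the light tag.

    `substitutions` maps an element symbol to the number of atoms that
    carry the heavy isotope in the heavy tag.
    """
    return sum(
        count * (ISOTOPE_MASSES[el][1] - ISOTOPE_MASSES[el][0])
        for el, count in substitutions.items()
    )


def mz_spacing(substitutions, charge):
    """Spacing on the m/z axis between the light and heavy peptide ions."""
    return pair_mass_shift(substitutions) / charge
```

With eight deuterium substitutions, for instance, `pair_mass_shift({"H": 8})` is about 8.05 Da, so a doubly charged peptide pair would be separated by roughly 4.03 m/z units.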

Types of Solid Phases 

Examples of solid supports suitable for the methods described herein include, 
but are not limited to: glass supports, plastic supports and the like. These terms are 
intended to include beads, pellets, disks, fibers, gels, or particles such as cellulose 
beads, pore-glass beads, silica gels, polystyrene beads optionally cross-linked with 
divinylbenzene and optionally grafted with polyethylene glycol and optionally 
functionalized with amino, hydroxy, carboxy, or halo groups, grafted co-poly beads, 
poly-acrylamide beads, latex beads, dimethylacrylamide beads optionally cross-linked 
with N,N'-bis-acryloyl ethylene diamine, glass particles coated with hydrophobic 
polymer, and the like, e.g., material having a rigid or semi-rigid surface; and soluble 
supports such as low molecular weight non-cross-linked polystyrene. 

However, in one preferred aspect, the solid phase is a resin. As used herein, a 
"resin" refers to an insoluble material (e.g., a polymeric material) or particle which 
allows ready separation from liquid phase materials by filtration. Resins can be used 
to carry tags and/or tagged peptides. Suitable resins include, but are not limited to, 
agarose, guaracrylamide, carbohydrate-based polymers (e.g., polysaccharide- 
containing), and the like. 

A "functionalized" solid phase or "functionalized resin" refers to an insoluble, 
polymeric material or particle comprising active sites for reacting with the anchoring 
site of a tag molecule allowing anchored tag molecules to be readily separated (by 
filtration, centrifugation, etc.) from excess reagents, soluble reaction by-products or 
solvents. See also, Sherrington (1998), Chem. Commun. 2275-2286, Winter, In 
Combinatorial Peptide and Non-Peptide Libraries (G. Jung, ed.), pp. 465-509. VCH, 
Weinheim (1996), and Hudson (1999), J. Comb. Chem. 1:330-360. 

In one aspect, a functionalized solid phase comprises a reactive group for 
stably associating with a cleavable linker such as a linker shown in Figure 3. 

In another aspect, a functionalized solid phase comprises cis hydroxy 
groups, preferably attached by a cyclic ring to the solid phase, or another chemical 
group suitable for forming a stable covalent association with an alkyl or aryl boronic 
acid, such as phenyl-B(OH)2. In one aspect, the solid phase comprises a cyclic 
alkane, such as 1,2-dihydroxycyclohexane. Preferably, the cyclic alkane comprises a 
5-membered ring (see, e.g., Figure 5A). 

In a further aspect, shown in Figure 5B, the cyclic alkane is used as a 
molecular tag while R-B(OH)2 molecules are used to capture the tag molecules. 

Methods of Using Non-Affinity Based Isotope Tags 

Isolated tagged peptides according to the invention can be used to facilitate 
quantitative determination by mass spectrometry of the relative amounts of proteins in 
different samples. Also, the use of differentially isotopically-labeled reagents as 
internal standards facilitates quantitative determination of the absolute amounts of one 
or more proteins present in the sample. Samples that can be analyzed by the methods of 
the invention include, but are not limited to, cell homogenates; cell fractions; 
biological fluids, including, but not limited to urine, blood, and cerebrospinal fluid; 
tissue homogenates; tears; feces; saliva; lavage fluids such as lung or peritoneal 
lavages; and generally, any mixture of biomolecules, e.g., such as mixtures including 
proteins and one or more of lipids, carbohydrates, and nucleic acids such as those 
obtained by partial or complete fractionation of cell or tissue homogenates. 

Preferably, a proteome is analyzed. By a proteome is intended at least about 
20% of total protein coming from a biological sample source, usually at least about 
40%, more usually at least about 75%, and generally 90% or more, up to and 
including all of the protein obtainable from the source. Thus the proteome may be 
present in an intact cell, a lysate, a microsomal fraction, an organelle, a partially 
extracted lysate, biological fluid, and the like. The proteome will be a mixture of 
proteins, generally having at least about 20 different proteins, usually at least about 50 
different proteins and in most cases, about 100 different proteins or more. 

Generally, the sample will have at least about 0.05 mg of protein, usually at 
least about 1 mg of protein or 10 mg of protein or more, typically at a concentration in 
the range of about 0.1-10 mg/ml. The sample may be adjusted to the appropriate 
buffer concentration and pH, if desired. 

Using Cleavable Linkers 

Figure 1 demonstrates one proposed strategy for quantitating proteins in a 
sample. Suitable samples include, but are not limited to, cell lysates and purified or 
partially purified proteins. However, the invention is particularly advantageous in 
that it allows protein quantification to be performed directly from cell lysates, thus 
minimizing the number of sample processing steps required and maximizing 
throughput, an essential feature of proteome analysis. 

In the scheme shown in the Figure, proteins from cells are contacted with an 
agent (e.g., an alkylating agent) to activate one or more reactive groups on the protein 
so as to render these one or more groups reactive with RS groups on the tag molecule. 
In one aspect, the tag molecule is stably associated with a solid phase prior to reacting 
with cellular proteins, or can be reacted with cellular proteins first and then stably 
associated with the solid phase. In one aspect, the tag molecule comprises a linker 
molecule and is bound via the linker molecule to the solid phase. Alternatively, the 
solid phase comprises the linker molecule and the tag molecule is contacted with the 
solid phase immobilized linker molecule before or after contacting the tag molecule 
with the solid phase and linkers. It should be obvious to those of skill in the art that 
the exact sequence of events can vary and that such variations are encompassed 
within the scope of the invention. 

As shown in Figure 1, the net result is the formation of a solid phase-linker- 
tag-protein complex. In the example shown in the Figure, the solid phase is a resin 
particle (R) and the linker comprises a cleavage site. 

The complex is exposed to a protease, generating solid phase-linker-tag- 
peptide complexes along with untagged peptides. Suitable proteases include, but are 
not limited to one or more of: serine proteases (e.g., such as trypsin, hepsin, SCCE, 
TADG12, TADG14); metallo proteases (e.g., such as PUMP-1); chymotrypsin; 
cathepsin; pepsin; elastase; pronase; Arg-C; Asp-N; Glu-C; Lys-C; carboxypeptidases 
A, B, and/or C; dispase; thermolysin; cysteine proteases such as gingipains, and the 
like. Generally, the type of protease is not limiting; however, preferably, the protease 
is an extracellular protease. In cases in which the steps prior to protease digestion 
were performed in the presence of high concentrations of denaturing solubilizing 
agents, the sample mixture is diluted until the denaturant concentration is compatible 
with the activity of the proteases used. 

Untagged peptides and other sample components are washed away. The 
remaining solid phase-linker-tag-peptide complexes are exposed to a cleavage 
stimulus (e.g., a chemical agent, light, heat, an enzyme, etc.) and the solid phase- 
linker portion of the complex is separated from the tag-peptide portion of the 
complex. Tagged peptides are subsequently analyzed by an appropriate method such 
as LC-MS/MS, discussed further below. 

Preferably, stable isotopes are incorporated into tag molecules prior to 
contacting the tag with sample proteins. 

In one particularly preferred aspect, proteins are obtained from cells in two 
different states (e.g., cells which are cancerous and non-cancerous, cells at two 
different developmental stages, cells exposed to a condition and cells unexposed to 
the condition, etc.) and are activated (e.g., alkylated) for reaction with the RS groups 
of tag molecules. Following activation, the two cell samples are incubated with tag 
molecules labeled with stable isotopes, linker molecules, and solid phases (in any 
sequence as described above) under suitable conditions to allow solid phase-linker- 
tag-protein complexes to form. Preferably, tags in the two sample tubes are labeled 
with different labels (e.g., heavy and light isotopes). 

The samples are combined in the same tube and then proteolyzed (e.g., 
trypsinized) and peptides which are not immobilized on the solid phase are removed 
by washing. Peptides are cleaved from the resin by virtue of the cleavable linker 
(e.g., using 50 mM DTT for a disulfide-based linker) and stable isotopes are retained 
with the peptides. These provide the means for quantification in a mass spectrometer: 
members of a peptide pair differ in mass by the exact amount of mass contributed by 
the stable isotope. Identical peptide pairs comprise members with heavy and light 
isotopes or comprise a labeled member and unlabeled member. Peptide sequencing of 
either member of the pair can be performed by tandem mass spectrometry to identify 
the parent protein from which the peptide was obtained. This can be repeated on a 
global scale utilizing only seconds to measure and sequence each peptide. By 
determining ratios of labeled and unlabeled or differentially labeled peptides, the 
parent protein can be quantitated in each sample. Thus, protein expression profiles 
can be obtained for whole cell lysates which include information identifying and 
quantitating each protein member in the sample. 
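
The step from peptide pair ratios to a protein-level ratio can be sketched as follows. This is an illustration only: it assumes that each identified pair has already been assigned to a parent protein and that heavy and light signal intensities are available, and it summarizes multiple peptides per protein with the median, which is a choice made for this sketch rather than something specified in the text. All names are hypothetical.

```python
from collections import defaultdict
from statistics import median


def protein_ratios(peptide_pairs):
    """Estimate heavy/light abundance ratios per protein.

    `peptide_pairs` is an iterable of (protein_id, light_intensity,
    heavy_intensity) tuples, one per identified peptide pair. Pairs with a
    non-positive light intensity are skipped because no ratio is defined.
    """
    ratios = defaultdict(list)
    for protein_id, light, heavy in peptide_pairs:
        if light > 0:
            ratios[protein_id].append(heavy / light)
    # The median damps the influence of a single poorly measured peptide.
    return {protein: median(values) for protein, values in ratios.items()}
```

A protein observed through several tagged peptides thus yields several independent ratio measurements, which is what makes the quantitation redundant.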

Use of pH Sensitive Anchoring Sites on Tag Molecules 

A scheme for using tag molecules comprising pH sensitive anchoring sites is 
shown in Figure 2. In one aspect, proteins are activated for reaction with RS groups 
of the tag molecule. Where the RS-group is a cysteine reactive moiety, disulfide 
bonds of proteins in a sample are reduced to free SH groups using a reducing agent 
(e.g., such as tri-n-butylphosphine, mercaptoethylamine, dithiothreitol, and the like). 
If required, this reaction can be performed in the presence of solubilizing agents 
including high concentrations of urea and detergents to maintain protein solubility. 

The proteins are contacted with suitable tag molecules, such as RS-R-B(OH)2 
molecules under conditions suitable for forming stable associations between the RS 
group and the activated proteins of the sample. Tag-protein complexes are reacted 
with one or more proteases (e.g., such as trypsin) to generate tag-peptide complexes 
and untagged peptides. Tagged peptides are contacted with a solid phase under 
conditions suitable for forming stable associations with the solid phase and untagged 
peptides are washed away. As above, the order of contacting with the solid phase can 
be varied. For example, tag molecules can be bound to the solid phase prior to 
contacting with proteins in a sample. Preferably, the pH is about 8.5 or higher, to 
maintain covalent bonding between the tag molecule and the solid phase during the 
contacting steps and wash steps. Reactions generally can be performed at room 
temperature. 

The pH of the sample is reduced to less than about 8.5, and preferably to less 
than a pH of 3, to remove the tagged peptide from the support. As above, tagged 
peptides may subsequently be analyzed by LC-MS/MS. Also, as above, parallel 
samples contacted with differentially labeled tags can be combined for protease 
digestion steps, purification of tagged molecules, and subsequent analysis by LC- 
MS/MS to determine ratios of labeled tagged peptides in the combined sample. 
Optimal conditions (e.g., pH and temperature) for removing tag molecules may be 
determined using an assay such as described in Example 1. 

Quantitation of Proteins in Samples 

Whether using the cleavable linker scheme or the pH sensitive 
anchoring site scheme, quantitation of proteins involves the same general principles. 
For the comparative analysis of several samples, one sample is designated a reference 
to which the other samples are related. Typically, the reference sample is labeled 
with the isotopically heavy reagent and the experimental samples are labeled with the 
isotopically light form of the reagent, although this choice of reagents is arbitrary. 

After tagging, aliquots of the samples labeled with the isotopically different 
reagents (e.g., heavy and light reagents, or labeled and unlabeled reagents) are 
combined and all the subsequent steps are performed on the pooled samples. 
Combination of the differentially labeled samples at this early stage of the procedure 
eliminates variability due to subsequent reactions and manipulations. Preferably 
equal amounts of each sample are combined. 

Following protease digestion and purification of tagged peptides in a 
combined sample, the mixture of proteins is submitted to a separation process, which 
preferably allows the separation of the protein mixture into discrete fractions. Each 
fraction is preferably substantially enriched in only one labeled protein of the protein 
mixture. The methods of the present invention are utilized in order to identify and/or 
quantify and/or determine the sequence of a tagged peptide. Within preferred 
embodiments of the invention, the tagged peptide is "substantially pure," after the 
separation procedure which means that the polypeptide is about 80% homogeneous, 
and preferably about 99% or greater homogeneous. Many methods well known to 
those of ordinary skill in the art may be utilized to purify tagged peptides. 
Representative examples include HPLC, Reverse Phase-High Pressure Liquid 
Chromatography (RP-HPLC), gel electrophoresis, chromatography, or any of a 
number of peptide purification methods as are known in the art. 

Preferred is microcapillary liquid chromatography. 

Analysis of isolated, tagged peptides by microcapillary LC-MSⁿ or CE-MSⁿ 
with data dependent fragmentation is performed using methods and instrument control 
protocols well-known in the art and described, for example, in Ducret et al., 1998; 
Figeys and Aebersold, 1998; Figeys et al., 1996; or Haynes et al., 1998. Also 
encompassed within the scope of the invention, although less preferred, are mass 
spectrometry methods such as fast atom bombardment (FAB), plasma desorption 
(PD), thermospray (TS), and matrix assisted laser desorption (MALDI) methods. 

In the analysis step, both the quantity and sequence identity of the proteins 
from which the tagged peptides originated can be determined by automated multistage 
MS (MSⁿ). This is achieved by the operation of the mass spectrometer in a dual mode 
in which it alternates in successive scans between measuring the relative quantities of 
peptides eluting from the capillary column and recording the sequence information of 
selected peptides. Peptides are quantified by measuring in the MS mode the relative 
signal intensities for pairs of peptide ions of identical sequence that are tagged with 
the molecules comprising light or heavy forms of isotope, respectively, or labeled and 
unlabeled members of a peptide pair, and which therefore differ in mass by the mass 
differential encoded within the labeled tagged reagent. 
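
Locating such pairs in a single MS scan amounts to searching the peak list for peaks separated by the expected mass differential divided by the charge state. The following is a minimal sketch, assuming centroided peaks given as (m/z, intensity) tuples, a known charge state, and a hypothetical matching tolerance; the function name and the example values are illustrative only.

```python
def find_isotope_pairs(peaks, mass_shift, charge=1, tol=0.01):
    """Pair light and heavy peaks separated by mass_shift / charge on m/z.

    `peaks` is a list of (mz, intensity) tuples from one MS scan. Returns a
    list of (light_mz, heavy_mz, heavy_to_light_ratio) tuples.
    """
    spacing = mass_shift / charge
    ordered = sorted(peaks)
    pairs = []
    for i, (light_mz, light_int) in enumerate(ordered):
        for heavy_mz, heavy_int in ordered[i + 1:]:
            delta = heavy_mz - light_mz
            if delta > spacing + tol:
                break  # peaks are sorted, so later ones are even farther away
            if abs(delta - spacing) <= tol and light_int > 0:
                pairs.append((light_mz, heavy_mz, heavy_int / light_int))
    return pairs
```

For a hypothetical 8.05 Da mass differential and doubly charged ions, peaks at m/z 500.25 and 504.275 would be reported as one pair, and the intensity ratio of the two peaks gives the relative abundance of the peptide in the two samples.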

Peptide sequence information is automatically generated by selecting peptide 
ions of a particular mass-to-charge (m/z) ratio for collision-induced dissociation 
(CID) in the mass spectrometer operating in the MSⁿ mode (Link, A. J. et al., 1997; 
Gygi, S. P. et al., 1999; and Gygi, S. P. et al., 1999). The resulting CID spectra are 
then automatically correlated with sequence databases to identify the protein from 
which the sequenced peptide originated. Combination of the results generated by MS 
and MSⁿ analyses of labeled tagged peptide samples therefore determines the relative 
quantities, as well as the sequence identities, of the components of protein mixtures in 
a single, automated operation. 

The approach employed herein for quantitative proteome analysis is based on 
two principles. First, a short sequence of contiguous amino acids from a protein (5-25 
residues) contains sufficient information to uniquely identify that protein. Protein 
identification by MSⁿ is accomplished by correlating the sequence information 
contained in the CID mass spectrum with sequence databases, using computer 
searching algorithms known in the art (Eng, J. et al., 1994; Mann, M. et al., 1994; 
Qin, J. et al., 1997; Clauser, K. R. et al., 1995). Second, pairs of identical peptides 
tagged with the light and heavy affinity tagged reagents, or labeled and unlabeled 
peptides, respectively, (or in analysis of more than two samples, sets of identical 
tagged peptides in which each set member is differentially isotopically labeled) are 
chemically identical and therefore serve as mutual internal standards for accurate 
quantitation. 
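
The first principle, that a short contiguous sequence is usually enough to single out one protein, can be illustrated with a plain substring lookup. This is a deliberately simplified sketch: the search algorithms cited above correlate CID spectra with database sequences rather than matching exact strings, and the toy database below contains made-up sequences. All names are hypothetical.

```python
def proteins_containing(peptide, database):
    """Return the IDs of all database proteins that contain `peptide`.

    `database` maps protein IDs to amino acid sequences (one-letter code).
    A single returned ID means the peptide identifies that protein uniquely.
    """
    return sorted(pid for pid, seq in database.items() if peptide in seq)


# Toy database with made-up sequences, for illustration only.
TOY_DATABASE = {
    "protA": "MKTAYIAKQRCISFVKSHFSRQLEERLGLIEVQ",
    "protB": "MSDNGCLTPEQVKAYIAKDDLRSWAVKHG",
    "protC": "MGLSDGEWQLVLNVWGKVEADIPGHGQEVLIR",
}
```

In this toy example the five-residue sequence `AYIAK` occurs in two proteins, while extending it to `AYIAKQR` leaves only `protA`, showing how a few additional residues resolve the ambiguity.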

The MS measurement readily differentiates between peptides originating from 
different samples, representing for example different cell states, because of the 
difference between isotopically distinct reagents attached to the peptides. The ratios 
between the intensities of the differing weight components of these pairs or sets of 
peaks provide an accurate measure of the relative abundance of the peptides (and 
hence the proteins) in the original cell pools because the MS intensity response to a 
given peptide is independent of the isotopic composition of the reagents (De 
Leenheer, A. P. et al., 1992). 

Several beneficial features of the method are apparent. At least two peptides 
can be detected from each protein in a pooled sample mixture. Therefore, both 
quantitation and protein identification can be redundant. Further, where the peptide 
group which reacts with the RS group of a tag molecule is relatively rare (e.g., such as 
a cysteinyl residue), the presence of such a group in a tagged peptide adds an 
additional powerful constraint for database searching (Sechi, S. et al., 1998). The use 
of relatively rare peptide groups and the tagging and selective enrichment for peptides 
containing these groups significantly reduces the complexity of the peptide mixture 
generated by the concurrent digestion of multiple proteins and facilitates MSⁿ 
analysis. For example, a theoretical tryptic digest of the entire yeast proteome (6113 
proteins) produces 344,855 peptides, but only 30,619 of these peptides contain a 
cysteinyl residue. Additionally, the chemistries used in both schemes discussed above 
are compatible with LC-MS/MS analysis. 
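
The reduction in complexity can be estimated for any sequence set with a theoretical digest. The sketch below assumes the usual simplified trypsin rule (cleavage after K or R unless the next residue is P, no missed cleavages) and counts how many of the resulting peptides contain cysteine; the function names are hypothetical and the rule is an assumption of this sketch, not a statement of how the figures above were computed.

```python
import re


def tryptic_peptides(sequence):
    """Split a protein sequence after K or R, except when followed by P.

    This is the usual simplified trypsin rule with no missed cleavages.
    """
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def cysteine_fraction(proteins):
    """Return (total peptides, cysteine-containing peptides) for a digest."""
    peptides = [p for seq in proteins for p in tryptic_peptides(seq)]
    with_cys = [p for p in peptides if "C" in p]
    return len(peptides), len(with_cys)
```

For the yeast figures quoted above, 30,619 of 344,855 peptides corresponds to roughly 8.9% of the theoretical digest, so selecting cysteine-containing peptides removes about nine tenths of the mixture.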

The methods described above generally start with about 100 µg of protein and 
require no fractionation techniques. However, the methods are compatible with any 
biochemical, immunological or cell biological fractionation methods that reduce 
sample complexity and enrich for proteins of low abundance while quantitation is 
maintained. This method can be redundant in both quantitation and identification if 
multiple groups on a single protein bind to an RS group of a tag molecule. 

The methods of this invention can be applied to analysis of low abundance 
proteins and classes of proteins with particular physico-chemical properties including 
poor solubility, large or small size and extreme pI values. 

An application of the chemistry and methods described above is the establishment of 
quantitative profiles of complex protein samples and ultimately total lysates of cells 
and tissues. 

In addition, the reagents and methods of this invention may be used to 
determine sites of protein modifications and therefore the abundance of modified 
proteins in a sample. For example, in one aspect, when the RS group reacts with a 
modified residue on a protein, differentially isotopically labeled tagged peptides are 
used to determine the sites of induced protein modification. Modified peptides are 
identified in a protease-digested sample mixture by fragmentation in the ion source of 
an ESI-MS instrument and their relative abundances are determined by comparing the 
ion signal intensities of an experimental sample with the intensity of an included, 
isotopically labeled standard. Modifications included within the scope of the 
invention include, but are not limited to, glycosylation, methylation, acylation, 
phosphorylation, ubiquitination, farnesylation, and ribosylation. 

In one aspect, the RS group is a boron tag of reversed polarity, that is, the two 
hydroxyl groups of R-B(OH)2 are exposed in solution to bind to glycosylated 
peptides. In this scenario, the boron tag is attached to the solid phase, SP, via another 
type of molecule such as a catechol group. 

In still another aspect, cyclic alkanes comprising cis hydroxy groups are used 
as tag molecules while an R-B(OH)2 molecule is attached to a support and used to 
capture the tag molecules (see, e.g., Figure 5). 

Quantitative Analysis of Surface Proteins in Cells and Tissue 

The cell exterior membrane and its associated proteins (cell surface proteins) 
participate in sensing external signals and responding to environmental cues. 
Changes in the abundance of cell surface proteins can reflect a specific cellular state 
or the ability of a cell to respond to its changing environment. Thus, the 
comprehensive, quantitative characterization of the protein components of the cell 
surface can identify marker proteins or constellations of marker proteins characteristic 
for a particular cellular state, or explain the molecular basis for cellular responses to 
external stimuli. Indeed, changes in expression of a number of cell surface receptors 
such as Her2/neu, erbB, IGFI receptor, and EGF receptor have been implicated in 
carcinogenesis and a current immunological therapeutic approach for breast cancer is 
based on the infusion of an antibody (Herceptin, Genentech, Palo Alto, Calif.) that 
specifically recognizes Her2/neu receptor. 

Cell surface proteins are also experimentally accessible. Diagnostic assays for 
cell classification and preparative isolation of specific cells by methods such as cell 
sorting or panning are based on cell surface proteins. Thus, differential analysis of 
cell surface proteins between normal and diseased (e.g., cancer) cells can identify 
important diagnostic or therapeutic targets. While the importance of cell surface 
proteins for diagnosis and therapy of cancer has been recognized, membrane proteins 
have been difficult to analyze. Due to their generally poor solubility they tend to be 
under-represented in standard 2D gel electrophoresis patterns and attempts to adapt 
2D electrophoresis conditions to the separation of membrane proteins have met with 
limited success. The method of this invention can overcome the limitations inherent 
in the traditional techniques. 

Methods can be applied to enhance the selectivity for tagged peptides derived 
from cell surface proteins. For example, tagged cell surface proteins can be 
protease-digested directly on the intact cells to generate tagged peptides, purified and 
analyzed as discussed above. In addition, traditional cell membrane preparations may 
be used as an initial step to enrich cell surface proteins. These methods can include 
gentle cell lysis with a Dounce homogenizer and a series of density gradient 
centrifugations to 
isolate membrane proteins prior to proteolysis. This method can provide highly 
enriched preparations of cell surface proteins. In the application of the methods of 
this invention to cell surface proteins, once the tagged proteins are fragmented, the 
tagged peptides behave no differently from the peptides generated from more soluble 
samples. 

Methods according to the invention can be used for qualitative and/or 
quantitative analysis of global protein expression profiles in cells and tissues, i.e., 
analysis of proteomes. The method can also be employed to screen for and identify 
proteins whose expression level in cells, tissue or biological fluids is affected by a 
stimulus (e.g., administration of a drug or contact with a potentially toxic material), 
by a change in environment (e.g., nutrient level, temperature, passage of time) or by a 
change in condition or cell state (e.g., disease state, malignancy, site-directed 
mutation, gene knockouts) of the cell, tissue or organism from which the sample 
originated. The proteins identified in such a screen can function as markers for the 
changed state. For example, comparisons of protein expression profiles of normal and 
malignant cells can result in the identification of proteins whose presence or absence 
is characteristic and diagnostic of the malignancy. 

The methods herein can be employed to screen for changes in the expression 
or state of enzymatic activity of specific proteins. These changes may be induced by 
a variety of compounds or chemicals, including pharmaceutical agonists or 
antagonists, or potentially harmful or toxic materials. The knowledge of such changes 
may be useful for diagnosing abnormal physiological responses and for investigating 
complex regulatory networks in cells. 

Compounds which can be evaluated include, but are not limited to: drugs; 
toxins; proteins; polypeptides; peptides; amino acids; antigens; cells, cell nuclei, 
organelles, portions of cell membranes; viruses; receptors; modulators of receptors 
(e.g., agonists, antagonists, and the like); enzymes; enzyme modulators (e.g., such as 
inhibitors, cofactors, and the like); enzyme substrates; hormones; nucleic acids (e.g., 
such as oligonucleotides; polynucleotides; genes, cDNAs; RNA; antisense molecules, 
ribozymes, aptamers), and combinations thereof. Compounds also can be obtained 
from synthetic libraries from drug companies and other commercially available 
sources known in the art (e.g., including, but not limited to, the LeadQuest® library) 
or can be generated through combinatorial synthesis using methods well known in the 
art. A compound is identified as a modulating agent if it alters the expression or site 
of modification of a polypeptide and/or if it alters the amount of modification by an 
amount that is significantly different from the amount observed in a control cell (e.g., 
not treated with compound) (setting p values to < 0.05). 
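
The p < 0.05 criterion above can be illustrated with a minimal sketch. The specification does not name a statistical test, so the exact two-sided permutation test on group means used here is an assumption chosen only because it needs no external library. The function names and the replicate values in the usage note are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation test on the difference of group means.

    Assumption: the source only requires p < 0.05 and does not specify
    which test is used; this is one possible choice.
    """
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(control)
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Enumerate every way of relabelling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        # Small tolerance guards against floating-point round-off.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


def is_modulating_agent(treated, control, alpha=0.05):
    """Return True when treated and control differ at the chosen alpha."""
    return permutation_p_value(treated, control) < alpha
```

With four hypothetical replicate measurements per group that separate completely, for instance `is_modulating_agent([2.1, 2.3, 2.2, 2.4], [1.0, 1.1, 0.9, 1.05])`, only 2 of the 70 possible relabellings are as extreme as the observed one, so the p value is 2/70 and the compound would be flagged. With only three replicates per group this test cannot reach p < 0.05, since the smallest attainable value is 2/20.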

Compounds identified as modulating agents are used in methods of treatment 
of pathologies associated with abnormal sites/levels of the particular modification. 
For administration to a patient, one or more such compounds are generally formulated 
as a pharmaceutical composition. Preferably, a pharmaceutical composition is a 
sterile aqueous or non-aqueous solution, suspension or emulsion, which additionally 
comprises a physiologically acceptable carrier (i.e., a non-toxic material that does not 
interfere with the activity of the active ingredient). More preferably, the composition 
also is non-pyrogenic and free of viruses or other microorganisms. Any suitable 
carrier known to those of ordinary skill in the art may be used. Representative 
carriers include, but are not limited to: physiological saline solutions, gelatin, water, 
alcohols, natural or synthetic oils, saccharide solutions, glycols, injectable organic 
esters such as ethyl oleate or a combination of such materials. Optionally, a 
pharmaceutical composition additionally contains preservatives and/or other additives 
such as, for example, antimicrobial agents, anti-oxidants, chelating agents and/or inert 
gases, and/or other active ingredients. 

Routes and frequency of administration, as well as doses, will vary from patient 
to patient. In general, the pharmaceutical compositions are administered intravenously, 
intraperitoneally, intramuscularly, subcutaneously, intracavity or transdermally. 
Between 1 and 6 doses are administered daily. A suitable dose is an amount that is 
sufficient to show improvement in the symptoms of a patient afflicted with a disease 
associated with an aberrant level of expression of a particular protein or the site or amount 
of modification of the protein. Such improvement may be detected by monitoring 
appropriate clinical or biochemical endpoints as is known in the art. In general, the 
amount of modulating agent present in a dose, or produced in situ by DNA present in 
a dose (e.g., where the modulating agent is a polypeptide or peptide encoded by the 
DNA), ranges from about 1 µg to about 100 mg per kg of host. Suitable dose sizes 
will vary with the size of the patient, but will typically range from about 10 mL to 
about 500 mL for a 10-60 kg animal. A patient can be a mammal, such as a human, or 
a domestic animal. 
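
As a purely arithmetic illustration of the per-kilogram range stated above (about 1 µg to about 100 mg per kg of host), the following sketch scales the two bounds to a given body mass. The function name and parameter names are hypothetical, and the sketch is not a dosing recommendation.

```python
def dose_range_mg(body_mass_kg, low_ug_per_kg=1.0, high_mg_per_kg=100.0):
    """Scale the stated per-kg bounds to a body mass; returns (low, high) in mg.

    Defaults mirror the range given in the text: about 1 microgram to
    about 100 milligrams of modulating agent per kg of host.
    """
    low_mg = low_ug_per_kg * body_mass_kg / 1000.0  # micrograms -> milligrams
    high_mg = high_mg_per_kg * body_mass_kg
    return low_mg, high_mg
```

For a 60 kg host, `dose_range_mg(60)` gives a lower bound of 0.06 mg and an upper bound of 6000 mg, which shows how wide the stated range is.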

The methods herein can also be used to implement a variety of clinical and 
diagnostic analyses to detect the presence, absence, deficiency or excess of a given 
protein or protein function in a biological fluid (e.g., blood), or in cells or tissue. The 
methods are particularly useful in the analysis of complex mixtures of proteins, i.e., 
those containing 5 or more distinct proteins or protein functions. Therefore in one 
aspect, the methods are used to compare and quantitate levels of proteins and/or sites 
and amounts of protein modifications in samples between a normal cell sample and a 
cell sample from a patient with a pathological condition (preferably, the cell sample is 
the target of the pathological condition) in order to identify the presence, absence, 
deficiency or excess of a given protein or protein function which is associated with 
the pathological condition. 

Kits 

The invention further provides a kit comprising reagents and/or compositions as 
described above. For example, in one aspect the invention provides a tag molecule 
and one or more of a reagent selected from the group consisting of: an activating 
agent for providing active groups on a protein which bind to the reactive site of the 
tag molecule; a solid phase; one or more agents for lysing a cell; a pH altering agent; 
one or more proteases; one or more cell samples or fractions thereof. In one aspect, 
the tag molecule is further stably associated with a peptide, i.e., a tagged reference 
peptide is included suitable for a particular assay of choice. 

The invention also provides kits comprising a plurality of tagged peptide 
molecules, each tagged peptide molecule comprising a peptide and a tag molecule 
stably associated with the protein, the tag molecule further comprising an isotope 
label, and a pH sensitive anchoring site for anchoring the tag molecule to a solid 
phase. In one aspect, the kit comprises pairs of tagged peptides and each member of a 
pair of tagged peptides comprises an identical peptide and is differentially labeled 
from the other member of the pair. In another aspect, the kit comprises at least one 
set of tagged peptides, the set comprising different peptides corresponding to a single 
protein. In still another aspect, at least one set of tagged peptides comprises peptides 
corresponding to modified and unmodified forms of a single protein. In a further 
aspect, the kit comprises at least one set of tagged peptides from a first cell at a first 
cell state and at least one set of tagged peptides from a second cell at a second cell 
state. For example, the first cell may be a normally proliferating cell while the second 
cell is an abnormally proliferating cell (e.g., a cancer cell). First and second cells may 
also represent different stages of cancer, different developmental stages, cells exposed 
to agents (e.g., drugs, potentially toxic or carcinogenic materials) or conditions (e.g., 
pH, temperature, nutrient levels, passage of time) and cells not exposed to agents or 
conditions, as well as cells which do or do not express particular recombinant DNA 
constructs. 

Examples 

The invention will now be further illustrated with reference to the following 
examples. It will be appreciated that what follows is by way of example only and that 
modifications to detail may be made while still falling within the scope of the 
invention. 

Example 1. Arylboronic Acids as New ICAT Reagents 

Arylboronic Acid-Immobilized Glutathione On a Carbohydrate Affinity 
Column 

A column of carbohydrate was immobilized on agarose (Calbiochem, 
gal-α-1,3-gal on agarose, cat. # 215364, 2 mls packed resin) using 0.05% SDS in 50 mM 
ammonium bicarbonate, pH = 8.1; however, SDS may be omitted. The column was 
equilibrated with at least 10 column volumes of the 50 mM AmBic, without 
detergent, before sample was applied. An arylboronic conjugate was synthesized 
using standard chemistries. 68 mgs GSH in 1.9 mls of water was combined with 100 
µl of 1M potassium phosphate, pH = 7.4 and stirred for 5 minutes. 8.8 mgs of 
arylboronic acid were added which dissolved within about 15 minutes. 

The scheme for generating the conjugates is shown below: 

Glutathione (GSH), M.W. = 307.33        MPBA, M.W. = 216.99 
71 mgs                                  10 mgs 
230 µmol                                46.1 µmol 
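
The molar amounts in the scheme follow from mass divided by molecular weight. The sketch below checks that arithmetic. It assumes molecular weights of 307.33 for GSH and 216.99 for MPBA, which are read from the partly garbled scheme labels and are therefore an assumption. The function name is hypothetical.

```python
# Assumed molecular weights (g/mol), read from the scheme labels.
GSH_MW = 307.33
MPBA_MW = 216.99


def micromoles(mass_mg, molecular_weight):
    """Convert a mass in mg to micromoles: (mg / (g/mol)) * 1000."""
    return mass_mg / molecular_weight * 1000.0


gsh_umol = micromoles(71, GSH_MW)    # amount of GSH listed in the scheme
mpba_umol = micromoles(10, MPBA_MW)  # amount of MPBA listed in the scheme
molar_excess = gsh_umol / mpba_umol  # GSH relative to MPBA
```

Under these assumptions, 10 mgs of MPBA comes to about 46.1 µmol, matching the scheme, and 71 mgs of GSH comes to about 231 µmol, close to the value shown. GSH is therefore present at roughly a five-fold molar excess over MPBA.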




One ml of AmBic (1M) was added and the solution was stirred another 5 
minutes, after which 100 of 150 fluorescein was added. The column was 
washed with 50 mM AmBic solution at a flow rate of about 1 ml/minute. Five ml 
fractions were collected and the amount of fluorescein in the fractions was 
determined. A large amount of fluorescein initially eluted. After collecting fraction 
9, elution buffer consisting of 100 mM glycine, pH = 2.5, and containing 25 mM 
glucose was used to wash the column. Five ml fractions were collected through 
fraction 15. Absorbance was determined at 254 and 490 nm, to determine the 
presence of aryl groups and fluorescein, respectively, in the fractions. The elution 
profile is shown in Figure 4. 

Fraction 10 showed a significant amount of product. Fractions 10-12 were 
combined and saved as a combined sample (combined sample 1) at -80°C for LC-MS 
analysis, as were the flow-through fractions 3-6 (combined sample 2). Thus, even 
without optimal conditions for recovery, significant amounts of product were 
recovered. 



These results demonstrate that boronic acid conjugates can be used to provide 
pH sensitive molecular tags which can be recovered at high efficiency. 

Variations, modifications, and other implementations of what is described 
herein will occur to those of ordinary skill in the art without departing from the spirit 
and scope of the invention as described and claimed herein. 

All of the references identified hereinabove are expressly incorporated herein 
by reference. 



What is claimed is: 